Arbitrary Precision Arithmetic - SIMD Style

Authors

  • S. Balakrishnan
  • S. K. Nandy
Abstract

Current general-purpose processors have been enhanced with what is called a "media instruction set" to achieve performance gains in applications that are media-processing intensive. The added instructions exploit two facts: media applications have small native datatypes, with widths much smaller than those supported by commercial processors, and such applications offer a plethora of data parallelism. Current processors enhanced with a "media instruction set" support arithmetic on sub-datatypes of only 8-bit, 16-bit, 32-bit and 64-bit precision. In this paper we motivate the need for arbitrary-precision packed arithmetic, wherein the width of the sub-datatypes is programmable by the user, and propose an implementation for arithmetic on such packed datatypes. The proposed scheme has marginal hardware overhead over conventional implementations of arithmetic on processors incorporating a multimedia extended instruction set.
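
To make the idea of a user-programmable sub-datatype width concrete, the following is a minimal software sketch and not the hardware scheme the paper proposes. It emulates wraparound packed addition on fields of any width inside one 64-bit word, using the well-known masking trick of adding the low bits of each field separately and then restoring the top bit with an XOR. The names `field_high_mask`, `packed_add`, `pack` and `unpack` are hypothetical helpers introduced here for illustration.

```python
def field_high_mask(width, word_bits=64):
    """Return a mask with the top bit of every complete `width`-bit field set."""
    mask = 0
    for pos in range(width - 1, word_bits, width):
        mask |= 1 << pos
    return mask


def packed_add(a, b, width, word_bits=64):
    """Add two packed words field by field, wrapping modulo 2**width per field."""
    word_mask = (1 << word_bits) - 1
    high = field_high_mask(width, word_bits)
    low = ~high & word_mask
    # With the top bit of each field cleared, a carry out of the low bits
    # lands in that field's own top bit and cannot reach the next field.
    partial = (a & low) + (b & low)
    # XOR in the original top bits to finish the per-field sum.
    return (partial ^ ((a ^ b) & high)) & word_mask


def pack(values, width):
    """Pack a list of small integers into one word, field 0 in the low bits."""
    word = 0
    for i, value in enumerate(values):
        word |= (value & ((1 << width) - 1)) << (i * width)
    return word


def unpack(word, width, count):
    """Extract `count` fields of `width` bits each from a packed word."""
    field_mask = (1 << width) - 1
    return [(word >> (i * width)) & field_mask for i in range(count)]


if __name__ == "__main__":
    # Five 12-bit fields in one 64-bit word: a width no fixed 8/16/32-bit
    # media instruction would handle directly.
    x = pack([4095, 1, 2000, 100, 7], 12)
    y = pack([1, 2, 3000, 200, 9], 12)
    print(unpack(packed_add(x, y, 12), 12, 5))
```

Because the field width is an ordinary parameter, the same routine handles 5-bit, 12-bit or 21-bit fields; bits left over above the last complete field are simply unused.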

Similar resources

Integer and Rational Arithmetic on MasPar

The speed of integer and rational arithmetic increases significantly by systolic implementation on a SIMD architecture. For multiplication of integers one obtains linear speed-up (up to 29 times), using a serial-parallel scheme. A two-dimensional algorithm for multiplication of polynomials gives half-linear speed-up (up to 383 times). We also implement multiprecision rational arithmetic using k...

Efficient arithmetic on ARM-NEON and its application for high-speed RSA implementation

Advanced modern processors support Single Instruction Multiple Data (SIMD) instructions (e.g. Intel-AVX, ARM-NEON), and a massive body of research has been conducted on vector-parallel implementations of modular arithmetic, which are crucial components for modern public-key cryptography ranging from RSA, ElGamal, DSA and ECC. In this paper, we introduce a novel Double Operand Scanning (DOS) me...

Montgomery Multiplication on the Cell

A technique to speed up Montgomery multiplication targeted at the Synergistic Processor Elements (SPE) of the Cell Broadband Engine is proposed. The technique consists of splitting a number into four consecutive parts. These parts are placed one by one in each of the four element positions of a vector, representing columns in a 4-SIMD organization. This representation enables arithmetic to be p...

Montgomery Modular Multiplication on ARM-NEON Revisited

Montgomery modular multiplication constitutes the “arithmetic foundation” of modern public-key cryptography with applications ranging from RSA, DSA and Diffie-Hellman over elliptic curve schemes to pairing-based cryptosystems. The increased prevalence of SIMD-type instructions in commodity processors (e.g. Intel SSE, ARM NEON) has initiated a massive body of research on vector-parallel implemen...

Kestrel: Design of an 8-bit SIMD Parallel Processor

Kestrel is a high-performance programmable parallel co-processor. Its design is the result of examination and reexamination of algorithmic, architectural, packaging, and silicon design issues, and the interrelations between them. The final system features a linear array of 8-bit processing elements, each with local memory, an arithmetic logic unit (ALU), a multiplier, and other functional units. ...


Publication date: 1998